Enabling Data Retrieval: by Ranking and Beyond

Author

  • CHENGKAI LI
Abstract

The ubiquitous usage of databases for managing structured data, compounded with the expanded reach of the Internet to end users, has brought forward new scenarios of data retrieval. Users often want to express non-traditional fuzzy queries with soft criteria, in contrast to Boolean queries, and to explore what choices are available in databases and how they match the query criteria. Conventional database management systems (DBMSs) have become increasingly inadequate for such new scenarios. Towards enabling data retrieval, this thesis first studies how to fundamentally integrate ranking into databases. We built RankSQL, a DBMS that provides systematic and principled support of ranking queries. With a new ranking algebra and an extended query optimizer for the algebra, RankSQL captures ranking as a first-class construct in databases, together with traditional Boolean constructs. We invented efficient techniques for answering ad-hoc ranking aggregate queries. RankSQL provides significant performance improvement over current DBMSs in processing ranking queries and ranking aggregate queries. This thesis further studies how to enable retrieval mechanisms beyond just ranking. Our explorative study in this direction is exemplified by two novel proposals: one is to integrate clustering and ranking of database query results; the other is to support inverse ranking queries that provide ranks of objects in query context. Injecting such non-traditional facilities into databases presents non-trivial challenges in both defining query semantics and designing query processing methods. We extended the SQL language to express such queries and invented partition- and summary-driven approaches to process them.
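To make the contrast between Boolean, ranking, and inverse ranking queries concrete, the following is a minimal Python sketch. It is not RankSQL's ranking algebra, optimizer, or any algorithm from the thesis: it evaluates everything naively in memory, and the toy data, the `score` function, its weights, and all function names are assumptions made up for illustration.

```python
import heapq

# Hypothetical toy data; none of these names or values come from the thesis.
houses = [
    {"id": 1, "price": 300_000, "size": 1500},
    {"id": 2, "price": 450_000, "size": 2400},
    {"id": 3, "price": 250_000, "size": 1100},
    {"id": 4, "price": 380_000, "size": 2000},
]


def boolean_query(rows, max_price, min_size):
    """Hard criteria: a row either qualifies or it does not."""
    return [r for r in rows if r["price"] <= max_price and r["size"] >= min_size]


def score(row):
    """Assumed soft criterion: prefer cheap and large houses (higher is better)."""
    cheapness = 1 - row["price"] / 500_000
    largeness = row["size"] / 2500
    return 0.5 * cheapness + 0.5 * largeness


def top_k(rows, k):
    """Ranking query: the k best rows by score, best first."""
    return heapq.nlargest(k, rows, key=score)


def inverse_rank(rows, target_id):
    """Inverse ranking query: 1-based rank of one object among the rows."""
    target = next(r for r in rows if r["id"] == target_id)
    return 1 + sum(score(r) > score(target) for r in rows)


print(boolean_query(houses, max_price=300_000, min_size=2000))
print([r["id"] for r in top_k(houses, 2)])
print(inverse_rank(houses, target_id=1))
```

With this data the strict Boolean filter returns nothing, while the ranking query still returns the two closest matches and the inverse ranking query reports where a given house stands. A real system would avoid scoring every row, which is the kind of cost the abstract says RankSQL addresses.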


Similar resources

An Ensemble-Learning-Based Algorithm for Learning to Rank in Information Retrieval

Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank has been shown to be useful in many applications of information retrieval, natural language processing, and data mining. Learning to rank can be described by two systems: a learning system and a ranking system. The learning system takes training data as input and constructs a ranking ...


Identifying and Ranking the Important Textual and Paratextual Elements in Fiction Retrieval

Purpose: The purpose of this study is to identify the textual and paratextual elements in retrieving fiction from the readers’ perspective in order to provide the most appropriate access points for the readers and to improve access to fictions based on the readers’ needs. Method: The current research is an applied study in terms of purpose, applying a mixed method that was conducted using the ...


Enabling soft queries for data retrieval

Data retrieval, finding relevant data from large databases, has become a serious problem as myriad databases have been brought online on the Web. For instance, querying the for-sale houses in Chicago from realtor.com returns thousands of matching houses. Similarly, querying "digital camera" in froogle.com returns hundreds of thousands of results. This data retrieval is essentially an online ra...


Testing and Validating the Role of Interactive Information Retrieval Model in Faculty Members' Psychological Enabling: A Case Study of Alborz University of Medical Sciences

The term "electromagnetic fields" (EMF) refers to a combination of electric and magnetic fields. Used as a diagnostic method as well as a therapeutic tool, EMF offers advantages such as ease of operation, painlessness, and a high degree of controllability, and today it has found wide application in regenerative medicine and cancer treatment. In addition to organs such as nerves, hearts, and bones that have an electrica...


Fuzzy retrieval of encrypted data by multi-purpose data-structures

The growing amount of information that has arisen from emerging technologies has caused organizations to face challenges in maintaining and managing their information. Expanding hardware and human resources, and outsourcing data management and maintenance to an external organization in the form of cloud storage services, are two common approaches to overcome these challenges. The first approach costs of...




Publication date: 2007